Results 1 - 20 of 27
1.
Stat Methods Med Res ; : 9622802241244613, 2024 Apr 09.
Article in English | MEDLINE | ID: mdl-38594934

ABSTRACT

This paper aims to extend the Besag model, a widely used Bayesian spatial model in disease mapping, to a non-stationary spatial model for irregular lattice-type data. The goal is to improve the model's ability to capture complex spatial dependence patterns and increase interpretability. The proposed model uses multiple precision parameters, accounting for different intensities of spatial dependence in different sub-regions. We derive a joint penalized complexity prior for the flexible local precision parameters to prevent overfitting and ensure contraction to the stationary model at a user-defined rate. The proposed methodology can be used as a basis for the development of various other non-stationary effects over other domains such as time. An accompanying R package fbesag equips the reader with the necessary tools for immediate use and application. We illustrate the novelty of the proposal by modeling the risk of dengue in Brazil, where the stationary spatial assumption fails and interesting risk profiles are estimated when accounting for spatial non-stationarity. Additionally, we model different causes of death in Brazil, where we use the new model to investigate the spatial stationarity of these causes.

2.
R Soc Open Sci ; 11(1): 230851, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38179076

ABSTRACT

Statistical analysis based on quantile methods is more comprehensive, flexible and less sensitive to outliers when compared to mean methods. Joint disease mapping is useful for inferring correlation between different diseases. Most studies investigate this link through multiple correlated mean regressions. We propose a joint quantile regression framework for multiple diseases where different quantile levels can be considered. We are motivated by the theorized link between the presence of malaria and the gene deficiency G6PD, where medical scientists have anecdotally discovered a possible link between high levels of G6PD and lower than expected levels of malaria, initially pointing towards the occurrence of G6PD inhibiting the occurrence of malaria. Thus, the need for flexible joint quantile regression in a disease mapping framework arises. Our model can be used for linear and nonlinear effects of covariates by stochastic splines since we define it as a latent Gaussian model. We perform Bayesian inference using the R integrated nested Laplace approximation, suitable even for large datasets. Finally, we illustrate the model's applicability by considering data from 21 countries, although better data are needed to prove a significant relationship. The proposed methodology offers a framework for future studies of interrelated disease phenomena.

3.
Biostatistics ; 25(2): 429-448, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-37531620

ABSTRACT

Modeling longitudinal and survival data jointly offers many advantages such as addressing measurement error and missing data in the longitudinal processes, understanding and quantifying the association between the longitudinal markers and the survival events, and predicting the risk of events based on the longitudinal markers. A joint model involves multiple submodels (one for each longitudinal/survival outcome) usually linked together through correlated or shared random effects. Their estimation is computationally expensive (particularly due to a multidimensional integration of the likelihood over the random effects distribution) so that inference methods rapidly become intractable, which restricts applications of joint models to a small number of longitudinal markers and/or random effects. We introduce a Bayesian approximation based on the integrated nested Laplace approximation algorithm implemented in the R package R-INLA to alleviate the computational burden and allow the estimation of multivariate joint models with fewer restrictions. Our simulation studies show that R-INLA substantially reduces the computation time and the variability of the parameter estimates compared with alternative estimation strategies. We further apply the methodology to analyze five longitudinal markers (3 continuous, 1 count, 1 binary, and 16 random effects) and competing risks of death and transplantation in a clinical trial on primary biliary cholangitis. R-INLA provides a fast and reliable inference technique for applying joint models to the complex multivariate data encountered in health research.


Subject(s)
Algorithms , Models, Statistical , Humans , Bayes Theorem , Computer Simulation , Monte Carlo Method , Longitudinal Studies
4.
Lancet Glob Health ; 11(10): e1519-e1530, 2023 10.
Article in English | MEDLINE | ID: mdl-37734797

ABSTRACT

BACKGROUND: Differences in mortality exist between sexes because of biological, genetic, and social factors. Sex differentials are well documented in children younger than 5 years but have not been systematically examined for ages 5-24 years. We aimed to estimate the sex ratio of mortality from birth to age 24 years and reconstruct trends in sex-specific mortality between 1990 and 2021 for 200 countries, major regions, and the world. METHODS: We compiled comprehensive databases on the mortality sex ratio (ratio of male to female mortality rates) for individuals aged 0-4 years, 5-14 years, and 15-24 years. The databases contain mortality rates from death registration systems, full birth and sibling histories from surveys, and reports on household deaths in censuses. We modelled the sex ratio of age-specific mortality as a function of the mortality in both sexes using Bayesian hierarchical time-series models. We report the levels and trends of sex ratios and estimate the expected female mortality and excess female mortality rates (the difference between the estimated female mortality and the expected female mortality) to identify countries with outlying sex ratios. FINDINGS: Globally, the mortality sex ratio was 1·13 (ie, boys were more likely to die than girls of the same age) for ages 0-4 years (90% uncertainty interval 1·11 to 1·15) in 2021. This ratio increased with age to 1·16 (1·12 to 1·20) for 5-14 years, reaching 1·65 for 15-24 years (1·52 to 1·75). In all age groups, the global sex ratio of mortality increased between 1990 and 2021, driven by faster declines in female mortality. In 2021, the probability of a newborn male reaching age 25 years was 94·1% (93·7 to 94·4), compared with 95·1% for a newborn female (94·7 to 95·3). 
We found a disadvantage of females versus males (compared with countries with similar total mortality) in 2021 in five countries for ages 0-4 years (Algeria, Bangladesh, Egypt, India, and Iran), one country (Suriname) for ages 5-14 years, and 13 countries for ages 15-24 years (including Bangladesh and India). We found the reverse pattern (disadvantage of males vs females compared with countries of similar total mortality) in one country in ages 0-4 years (Vietnam) and eight countries in ages 15-24 years (including Brazil and Mexico). Globally, the number of excess female deaths from birth to age 24 years was 86 563 (-6059 to 164 000) in 2021, down from 544 636 (453 982 to 633 265) in 1990. INTERPRETATION: The global sex ratio of mortality for all age groups in the first 25 years of life increased between 1990 and 2021. Targeted interventions should focus on countries with outlying sex ratios of mortality to reduce disparities due to discrimination in health care, nutrition, and violence. FUNDING: The Bill & Melinda Gates Foundation, US Agency for International Development, and King Abdullah University of Science and Technology.
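
The two quantities defined in the METHODS section above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the example rates are hypothetical, and deriving the expected female rate as the male rate divided by an expected sex ratio is a simplification introduced here, not the Bayesian hierarchical time-series model used in the study.

```python
def mortality_sex_ratio(male_rate, female_rate):
    """Ratio of male to female mortality rates (the definition given in the abstract)."""
    if female_rate <= 0:
        raise ValueError("female_rate must be positive")
    return male_rate / female_rate


def excess_female_mortality(estimated_female_rate, expected_female_rate):
    """Difference between estimated and expected female mortality (abstract's definition)."""
    return estimated_female_rate - expected_female_rate


# Hypothetical rates per 1000, chosen only to make the arithmetic easy to follow.
male_rate = 30.0
female_rate = 28.0

# ASSUMPTION: the expected female rate is approximated as male rate / expected ratio.
# The study obtains expected values from a model; this shortcut is illustrative only.
expected_ratio = 1.2
expected_female_rate = male_rate / expected_ratio

observed_ratio = mortality_sex_ratio(male_rate, female_rate)
excess = excess_female_mortality(female_rate, expected_female_rate)
```

A positive excess means female mortality is higher than would be expected for the given level of male mortality, which is how the abstract flags countries with outlying sex ratios.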


Subject(s)
Sex Characteristics , Sexual Behavior , Infant, Newborn , Humans , Female , Adolescent , Child , Male , Bayes Theorem , Bangladesh , Brazil
5.
Biom J ; 65(4): e2100322, 2023 04.
Article in English | MEDLINE | ID: mdl-36846925

ABSTRACT

Two-part joint models for a longitudinal semicontinuous biomarker and a terminal event have been recently introduced based on frequentist estimation. The biomarker distribution is decomposed into a probability of positive value and the expected value among positive values. Shared random effects can represent the association structure between the biomarker and the terminal event. The computational burden increases compared to standard joint models with a single regression model for the biomarker. In this context, the frequentist estimation implemented in the R package frailtypack can be challenging for complex models (i.e., a large number of parameters and dimension of the random effects). As an alternative, we propose a Bayesian estimation of two-part joint models based on the Integrated Nested Laplace Approximation (INLA) algorithm to alleviate the computational burden and fit more complex models. Our simulation studies confirm that INLA provides accurate approximation of posterior estimates and leads to reduced computation time and variability of estimates compared to frailtypack in the situations considered. We contrast the Bayesian and frequentist approaches in the analysis of two randomized cancer clinical trials (GERCOR and PRIME studies), where INLA has a reduced variability for the association between the biomarker and the risk of event. Moreover, the Bayesian approach was able to characterize subgroups of patients associated with different responses to treatment in the PRIME study. Our study suggests that the Bayesian approach using the INLA algorithm enables the fitting of complex joint models that might be of interest in a wide range of clinical applications.
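
The two-part decomposition mentioned above (probability of a positive value, and expected value among positive values) can be checked on raw data with a short Python sketch. The function name and the toy data are hypothetical, and this is only the descriptive identity behind the decomposition, not the joint model or its estimation with INLA or frailtypack.

```python
def two_part_summary(values):
    """Split a semicontinuous sample (zeros plus positive values) into two parts.

    Returns (probability of a positive value,
             mean among positive values,
             product of the two, which equals the overall mean).
    """
    if not values:
        raise ValueError("values must not be empty")
    if any(v < 0 for v in values):
        raise ValueError("semicontinuous data are assumed non-negative")
    positives = [v for v in values if v > 0]
    p_positive = len(positives) / len(values)
    mean_positive = sum(positives) / len(positives) if positives else 0.0
    return p_positive, mean_positive, p_positive * mean_positive


# Toy biomarker measurements with an excess of zeros (hypothetical values).
biomarker = [0.0, 0.0, 2.0, 4.0]
p_pos, mean_pos, overall_mean = two_part_summary(biomarker)
```

In a two-part model each of the two components would get its own regression, which is why the abstract notes the added computational burden compared with a single regression for the biomarker.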


Subject(s)
Models, Statistical , Neoplasms , Humans , Bayes Theorem , Computer Simulation , Algorithms
6.
Stat Methods Med Res ; 31(8): 1566-1578, 2022 08.
Article in English | MEDLINE | ID: mdl-35585712

ABSTRACT

Bayesian disease mapping, although undeniably useful for describing variation in risk over time and space, comes with the hurdle of prior elicitation on hard-to-interpret random effect precision parameters. We introduce a reparametrized version of the popular spatio-temporal interaction models, based on Kronecker product intrinsic Gaussian Markov random fields, that we name the variance partitioning model. The variance partitioning model includes a mixing parameter that balances the contribution of the main and interaction effects to the total (generalized) variance and enhances interpretability. The use of a penalized complexity prior on the mixing parameter aids in coding prior information in an intuitive way. We illustrate the advantages of the variance partitioning model using two case studies.


Subject(s)
Models, Statistical , Bayes Theorem
7.
Biom J ; 63(8): 1555-1574, 2021 12.
Article in English | MEDLINE | ID: mdl-34378223

ABSTRACT

In recent years, Bayesian meta-analysis expressed by a normal-normal hierarchical model (NNHM) has been widely used for combining evidence from multiple studies. Data provided for the NNHM are frequently based on a small number of studies and on uncertain within-study standard deviation values. Despite the widespread use of Bayesian NNHM, it has always been unclear to what extent the posterior inference is impacted by the heterogeneity prior (sensitivity S) and by the uncertainty in the within-study standard deviation values (identification I). Thus, to answer this question, we developed a unified method to simultaneously quantify both sensitivity and identification (S-I) for all model parameters in a Bayesian NNHM, based on derivatives of the Bhattacharyya coefficient with respect to relative latent model complexity (RLMC) perturbations. Three case studies exemplify the applicability of the method proposed: historical data for a conventional therapy, data from which one large study is first included and then excluded, and two subgroup meta-analyses specified by their randomization status. We analyzed six scenarios, crossing three RLMC targets with two heterogeneity priors (half-normal, half-Cauchy). The results show that S-I explicitly reveals which parameters are affected by the heterogeneity prior and by the uncertainty in the within-study standard deviation values. In addition, we compare the impact of both heterogeneity priors and quantify how S-I values are affected by omitting one large study and by the randomization status. Finally, the range of applicability of S-I is extended to Bayesian NtHM. A dedicated R package facilitates automatic S-I quantification in applied Bayesian meta-analyses.


Subject(s)
Bayes Theorem , Uncertainty
8.
Lancet Planet Health ; 5(4): e209-e219, 2021 04.
Article in English | MEDLINE | ID: mdl-33838736

ABSTRACT

BACKGROUND: Temperature and rainfall patterns are known to influence seasonal patterns of dengue transmission. However, the effect of severe drought and extremely wet conditions on the timing and intensity of dengue epidemics is poorly understood. In this study, we aimed to quantify the non-linear and delayed effects of extreme hydrometeorological hazards on dengue risk by level of urbanisation in Brazil using a spatiotemporal model. METHODS: We combined distributed lag non-linear models with a spatiotemporal Bayesian hierarchical model framework to determine the exposure-lag-response association between the relative risk (RR) of dengue and a drought severity index. We fit the model to monthly dengue case data for the 558 microregions of Brazil between January, 2001, and January, 2019, accounting for unobserved confounding factors, spatial autocorrelation, seasonality, and interannual variability. We assessed the variation in RR by level of urbanisation through an interaction between the drought severity index and urbanisation. We also assessed the effect of hydrometeorological hazards on dengue risk in areas with a high frequency of water supply shortages. FINDINGS: The dataset included 12 895 293 dengue cases reported between 2001 and 2019 in Brazil. Overall, the risk of dengue increased between 0-3 months after extremely wet conditions (maximum RR at 1 month lag 1·56 [95% CI 1·41-1·73]) and 3-5 months after drought conditions (maximum RR at 4 months lag 1·43 [1·22-1·67]). Including a linear interaction between the drought severity index and level of urbanisation improved the model fit and showed the risk of dengue was higher in more rural areas than highly urbanised areas during extremely wet conditions (maximum RR 1·77 [1·32-2·37] at 0 months lag vs maximum RR 1·58 [1·39-1·81] at 2 months lag), but higher in highly urbanised areas than rural areas after extreme drought (maximum RR 1·60 [1·33-1·92] vs 1·15 [1·08-1·22], both at 4 months lag). 
We also found the dengue risk following extreme drought was higher in areas that had a higher frequency of water supply shortages. INTERPRETATION: Wet conditions and extreme drought can increase the risk of dengue with different delays. The risk associated with extremely wet conditions was higher in more rural areas and the risk associated with extreme drought was exacerbated in highly urbanised areas, which have water shortages and intermittent water supply during droughts. These findings have implications for targeting mosquito control activities in poorly serviced urban areas, not only during the wet and warm season, but also during drought periods. FUNDING: Royal Society, Medical Research Council, Wellcome Trust, National Institutes of Health, Fundação Carlos Chagas Filho de Amparo à Pesquisa do Estado do Rio de Janeiro, and Conselho Nacional de Desenvolvimento Científico e Tecnológico. TRANSLATION: For the Portuguese translation of the abstract see Supplementary Materials section.


Subject(s)
Dengue , Urbanization , Bayes Theorem , Brazil/epidemiology , Dengue/epidemiology , Humans , Temperature , United States
9.
Biom J ; 62(4): 1105-1119, 2020 07.
Article in English | MEDLINE | ID: mdl-32011763

ABSTRACT

We propose a Bayesian spatiotemporal statistical model for predicting out-of-hospital cardiac arrests (OHCAs). Risk maps for Ticino, adjusted for demographic covariates, are built for explaining and forecasting the spatial distribution of OHCAs and their temporal dynamics. The occurrence intensity of the OHCA event in each area of interest, and the cardiac risk-based clustering of municipalities are efficiently estimated, through a statistical model that decomposes OHCA intensity into overall intensity, demographic fixed effects, spatially structured and unstructured random effects, time polynomial dependence, and spatiotemporal random effect. In the studied geography, time evolution and dependence on demographic features are robust over different categories of OHCAs, but with variability in their spatial and spatiotemporal structure. Two main OHCA incidence-based clusters of municipalities are identified.


Subject(s)
Biometry/methods , Models, Statistical , Out-of-Hospital Cardiac Arrest/epidemiology , Aged , Bayes Theorem , Cities/epidemiology , Demography , Female , Humans , Male , Middle Aged , Risk , Spatio-Temporal Analysis
10.
Spat Spatiotemporal Epidemiol ; 32: 100319, 2020 02.
Article in English | MEDLINE | ID: mdl-32007284

ABSTRACT

The main goal of disease mapping is to estimate disease risk and identify high-risk areas. Such analyses are hampered by the limited geographical resolution of the available data. Typically the available data are counts per spatial unit and the common approach is the Besag-York-Mollié (BYM) model. When precise geocodes are available, it is more natural to use Log-Gaussian Cox processes (LGCPs). In a simulation study mimicking childhood leukaemia incidence using actual residential locations of all children in the canton of Zürich, Switzerland, we compare the ability of these models to recover risk surfaces and identify high-risk areas. We then apply both approaches to actual data on childhood leukaemia incidence in the canton of Zürich during 1985-2015. We found that LGCPs outperform BYM models in almost all scenarios considered. Our findings suggest that there are important gains to be made from the use of LGCPs in spatial epidemiology.


Subject(s)
Leukemia/epidemiology , Models, Statistical , Adolescent , Child , Child, Preschool , Female , Humans , Infant , Leukemia/etiology , Male , Spatio-Temporal Analysis , Switzerland/epidemiology
11.
Stat Med ; 38(5): 778-791, 2019 02 28.
Article in English | MEDLINE | ID: mdl-30334278

ABSTRACT

Models of excess mortality with random effects were used to estimate regional variation in relative or net survival of cancer patients. Statistical inference for these models based on the Markov chain Monte Carlo (MCMC) methods is computationally intensive and, therefore, not feasible for routine analyses of cancer register data. This study assessed the performance of the integrated nested Laplace approximation (INLA) in monitoring regional variation in cancer survival. A Poisson regression model of excess mortality including both spatially correlated and unstructured random effects was fitted to the data of patients diagnosed with ovarian and breast cancer in Finland during 1955-2014 with follow up from 1960 through 2014 by using the period approach with five-year calendar time windows. We estimated standard deviations associated with variation (i) between hospital districts and (ii) between municipalities within hospital districts. Posterior estimates based on the INLA approach were compared to those based on the MCMC simulation. The estimates of the variation parameters were similar between the two approaches. Variation within hospital districts dominated the total variation between municipalities. In 2000-2014, the proportion of the average variation within hospital districts was 68% (95% posterior interval: 35%-93%) and 82% (60%-98%) out of the total variation in ovarian and breast cancer, respectively. In the estimation of regional variation, the INLA approach was accurate, fast, and easy to implement by using the R-INLA package.


Subject(s)
Breast Neoplasms/mortality , Demography/statistics & numerical data , Models, Statistical , Ovarian Neoplasms/mortality , Small-Area Analysis , Survival Analysis , Cities/statistics & numerical data , Female , Finland , Hospitals/statistics & numerical data , Humans , Poisson Distribution , Registries
12.
Spat Spatiotemporal Epidemiol ; 26: 25-34, 2018 08.
Article in English | MEDLINE | ID: mdl-30390932

ABSTRACT

In this note we discuss (Gaussian) intrinsic conditional autoregressive (CAR) models for disconnected graphs, with the aim of providing practical guidelines for how these models should be defined, scaled and implemented. We show how these suggestions can be implemented in two examples, on disease mapping.


Subject(s)
Models, Statistical , Spatio-Temporal Analysis , Data Interpretation, Statistical , Humans , Italy/epidemiology , Lip Neoplasms/epidemiology , Scotland/epidemiology , Stomach Neoplasms/epidemiology
13.
Stat Med ; 36(19): 3039-3058, 2017 Aug 30.
Article in English | MEDLINE | ID: mdl-28474394

ABSTRACT

In a bivariate meta-analysis, the number of diagnostic studies involved is often very low so that frequentist methods may result in problems. Using Bayesian inference is particularly attractive as informative priors that add a small amount of information can stabilise the analysis without overwhelming the data. However, Bayesian analysis is often computationally demanding and the selection of the prior for the covariance matrix of the bivariate structure is crucial with little data. The integrated nested Laplace approximations method provides an efficient solution to the computational issues by avoiding any sampling, but the important question of priors remains. We explore the penalised complexity (PC) prior framework for specifying informative priors for the variance parameters and the correlation parameter. PC priors facilitate model interpretation and hyperparameter specification as expert knowledge can be incorporated intuitively. We conduct a simulation study to compare the properties and behaviour of differently defined PC priors to currently used priors in the field. The simulation study shows that the PC prior seems beneficial for the variance parameters. The use of PC priors for the correlation parameter results in more precise estimates when specified in a sensible neighbourhood around the truth. To investigate the usage of PC priors in practice, we reanalyse a meta-analysis using the telomerase marker for the diagnosis of bladder cancer and compare the results with those obtained by other commonly used modelling approaches.
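
As a rough illustration of how expert knowledge enters a PC prior for a variance parameter: in the general PC prior framework, the prior for the standard deviation of a Gaussian random effect is exponential, with its rate fixed by a statement of the form P(sigma > U) = alpha. The Python sketch below shows only that calibration step. The function names and the values of U and alpha are hypothetical, and the abstract does not state the exact specification used for the bivariate model, so this should not be read as the paper's implementation.

```python
import math


def pc_prior_rate(upper, alpha):
    """Rate of the exponential prior on sigma such that P(sigma > upper) = alpha."""
    if upper <= 0:
        raise ValueError("upper must be positive")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return -math.log(alpha) / upper


def pc_prior_sd_density(sigma, rate):
    """Exponential density on the standard deviation scale."""
    return rate * math.exp(-rate * sigma) if sigma >= 0 else 0.0


def pc_prior_tail_probability(threshold, rate):
    """P(sigma > threshold) under the exponential prior."""
    return math.exp(-rate * threshold)


# Hypothetical prior statement: "sigma exceeds 1 with probability 0.05".
rate = pc_prior_rate(upper=1.0, alpha=0.05)
tail = pc_prior_tail_probability(1.0, rate)
```

The density is largest at sigma = 0, so the prior shrinks towards the simpler base model without the random effect unless the data support more variation.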


Subject(s)
Bayes Theorem , Diagnostic Tests, Routine , Meta-Analysis as Topic , Bias , Biometry/methods , Computer Simulation , Humans , Sensitivity and Specificity , Telomere , Urinary Bladder Neoplasms/diagnosis , Urinary Bladder Neoplasms/genetics
14.
Stat Methods Med Res ; 25(4): 1145-65, 2016 08.
Article in English | MEDLINE | ID: mdl-27566770

ABSTRACT

In recent years, disease mapping studies have become a routine application within geographical epidemiology and are typically analysed within a Bayesian hierarchical model formulation. A variety of model formulations for the latent level have been proposed but all come with inherent issues. In the classical BYM (Besag, York and Mollié) model, the spatially structured component cannot be seen independently from the unstructured component. This makes prior definitions for the hyperparameters of the two random effects challenging. There are alternative model formulations that address this confounding; however, the issue of how to choose interpretable hyperpriors is still unsolved. Here, we discuss a recently proposed parameterisation of the BYM model that leads to improved parameter control as the hyperparameters can be seen independently from each other. Furthermore, the need for a scaled spatial component is addressed, which facilitates assignment of interpretable hyperpriors and makes these transferable between spatial applications with different graph structures. The hyperparameters themselves are used to define flexible extensions of simple base models. Consequently, penalised complexity priors for these parameters can be derived based on the information-theoretic distance from the flexible model to the base model, giving priors with clear interpretation. We provide implementation details for the new model formulation which preserve sparsity properties, and we investigate systematically the model performance and compare it to existing parameterisations. Through a simulation study, we show that the new model performs well, both showing good learning abilities and good shrinkage behaviour. In terms of model choice criteria, the proposed model performs at least equally well as existing parameterisations, but only the new formulation offers parameters that are interpretable and hyperpriors that have a clear meaning.
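
The reparameterisation discussed above is commonly written as b = (1/sqrt(tau)) * (sqrt(1 - phi) * v + sqrt(phi) * u), where v is an unstructured standard normal effect, u is the scaled spatially structured effect, tau is a single precision parameter and phi is a mixing parameter. The Python sketch below is a minimal illustration of that formula only. The function names are hypothetical, the inputs v and u_scaled are assumed to be supplied already standardised and scaled, and no spatial graph, scaling step, or inference is implemented.

```python
import math


def bym2_effect(v, u_scaled, tau, phi):
    """Combine unstructured (v) and scaled structured (u_scaled) effects per area."""
    if tau <= 0:
        raise ValueError("tau must be positive")
    if not 0.0 <= phi <= 1.0:
        raise ValueError("phi must lie in [0, 1]")
    if len(v) != len(u_scaled):
        raise ValueError("v and u_scaled must have the same length")
    scale = 1.0 / math.sqrt(tau)
    return [
        scale * (math.sqrt(1.0 - phi) * vi + math.sqrt(phi) * ui)
        for vi, ui in zip(v, u_scaled)
    ]


def bym2_variance_split(tau, phi):
    """Approximate variance shares, ASSUMING both components have unit variance."""
    total = 1.0 / tau
    return {
        "unstructured": (1.0 - phi) * total,
        "structured": phi * total,
        "total": total,
    }


# Hypothetical values for two areas.
effect = bym2_effect(v=[1.0, -2.0], u_scaled=[0.5, 0.5], tau=4.0, phi=0.0)
shares = bym2_variance_split(tau=4.0, phi=0.25)
```

Because the total variance depends on tau alone and the split on phi alone, a prior can be placed on each parameter separately, which is the interpretability argument the abstract makes.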


Subject(s)
Bayes Theorem , Epidemiological Monitoring , Markov Chains , Normal Distribution
15.
Stat Med ; 35(11): 1848-65, 2016 May 20.
Article in English | MEDLINE | ID: mdl-26530705

ABSTRACT

In recent years, the availability of infectious disease counts in time and space has increased, and consequently, there has been renewed interest in model formulation for such data. In this paper, we describe a model that was motivated by the need to analyze hand, foot, and mouth disease surveillance data in China. The data are aggregated by geographical areas and by week, with the aims of the analysis being to gain insight into the space-time dynamics and to make short-term predictions, which will aid in the implementation of public health campaigns in those areas with a large predicted disease burden. The model we develop decomposes disease-risk into marginal spatial and temporal components and a space-time interaction piece. The latter is the crucial element, and we use a tensor product spline model with a Markov random field prior on the coefficients of the basis functions. The model can be formulated as a Gaussian Markov random field and so fast computation can be carried out using the integrated nested Laplace approximation approach. A simulation study shows that the model can pick up complex space-time structure and our analysis of hand, foot, and mouth disease data in the central north region of China provides new insights into the dynamics of the disease.


Subject(s)
Bayes Theorem , Hand, Foot and Mouth Disease/epidemiology , Child , China/epidemiology , Computer Simulation , Disease Outbreaks , Female , Humans , Male , Markov Chains , Poisson Distribution , Population Surveillance , Risk Factors
16.
Biometrics ; 71(1): 208-217, 2015 Mar.
Article in English | MEDLINE | ID: mdl-25257036

ABSTRACT

The Northern Humboldt Current System (NHCS) is the world's most productive ecosystem in terms of fish. In particular, the Peruvian anchovy (Engraulis ringens) is the major prey of the main top predators, like seabirds, fish, humans, and other mammals. In this context, it is important to understand the dynamics of the anchovy distribution to preserve it as well as to exploit its economic capacities. Using the data collected by the "Instituto del Mar del Perú" (IMARPE) during a scientific survey in 2005, we present a statistical analysis with three main goals: (i) to adapt to the characteristics of the sampled data, such as spatial dependence, high proportions of zeros and large sample sizes; (ii) to provide important insights on the dynamics of the anchovy population; and (iii) to propose a model for estimation and prediction of anchovy biomass in the NHCS offshore from Perú. These data were analyzed in a Bayesian framework using the integrated nested Laplace approximation (INLA) method. Further, to select the best model and to study the predictive power of each model, we performed model comparisons and predictive checks, respectively. Finally, we carried out a Bayesian spatial influence diagnostic for the preferred model.


Subject(s)
Bayes Theorem , Biomass , Biometry/methods , Data Interpretation, Statistical , Fishes/physiology , Models, Statistical , Algorithms , Animals , Computer Simulation , Environmental Monitoring/methods , Peru , Reproducibility of Results , Sample Size , Sensitivity and Specificity
17.
Spat Spatiotemporal Epidemiol ; 4: 33-49, 2013 Mar.
Article in English | MEDLINE | ID: mdl-23481252

ABSTRACT

During the last three decades, Bayesian methods have developed greatly in the field of epidemiology. Their main challenge focusses around computation, but the advent of Markov Chain Monte Carlo methods (MCMC) and in particular of the WinBUGS software has opened the doors of Bayesian modelling to the wide research community. However model complexity and database dimension still remain a constraint. Recently the use of Gaussian random fields has become increasingly popular in epidemiology as very often epidemiological data are characterised by a spatial and/or temporal structure which needs to be taken into account in the inferential process. The Integrated Nested Laplace Approximation (INLA) approach has been developed as a computationally efficient alternative to MCMC and the availability of an R package (R-INLA) allows researchers to easily apply this method. In this paper we review the INLA approach and present some applications on spatial and spatio-temporal data.


Subject(s)
Models, Statistical , Software , Spatio-Temporal Analysis , Humans
18.
Spat Spatiotemporal Epidemiol ; 7: 39-55, 2013 Dec.
Article in English | MEDLINE | ID: mdl-24377114

ABSTRACT

During the last three decades, Bayesian methods have developed greatly in the field of epidemiology. Their main challenge focusses around computation, but the advent of Markov Chain Monte Carlo methods (MCMC) and in particular of the WinBUGS software has opened the doors of Bayesian modelling to the wide research community. However model complexity and database dimension still remain a constraint. Recently the use of Gaussian random fields has become increasingly popular in epidemiology as very often epidemiological data are characterised by a spatial and/or temporal structure which needs to be taken into account in the inferential process. The Integrated Nested Laplace Approximation (INLA) approach has been developed as a computationally efficient alternative to MCMC and the availability of an R package (R-INLA) allows researchers to easily apply this method. In this paper we review the INLA approach and present some applications on spatial and spatio-temporal data.


Subject(s)
Bayes Theorem , Epidemiologic Methods , Models, Statistical , Spatio-Temporal Analysis , Stochastic Processes
19.
Biostatistics ; 14(1): 113-28, 2013 Jan.
Article in English | MEDLINE | ID: mdl-22988280

ABSTRACT

Next generation sequencing is quickly replacing microarrays as a technique to probe different molecular levels of the cell, such as DNA or RNA. The technology provides higher resolution, while reducing bias. RNA sequencing results in counts of RNA strands. This type of data imposes new statistical challenges. We present a novel, generic approach to model and analyze such data. Our approach aims at large flexibility of the likelihood (count) model and the regression model alike. Hence, a variety of count models is supported, such as the popular NB model, which accounts for overdispersion. In addition, complex, non-balanced designs and random effects are accommodated. Like some other methods, our method provides shrinkage of dispersion-related parameters. However, we extend it by enabling joint shrinkage of parameters, including those for which inference is desired. We argue that this is essential for Bayesian multiplicity correction. Shrinkage is effectuated by empirically estimating priors. We discuss several parametric (mixture) and non-parametric priors and develop procedures to estimate (parameters of) those. Inference is provided by means of local and Bayesian false discovery rates. We illustrate our method on several simulations and two data sets, also to compare it with other methods. Model- and data-based simulations show substantial improvements in the sensitivity at the given specificity. The data motivate the use of the ZI-NB as a powerful alternative to the NB, which results in higher detection rates for low-count data. Finally, compared with other methods, the results on small sample subsets are more reproducible when validated on their large sample complements, illustrating the importance of the type of shrinkage.


Subject(s)
Bayes Theorem , Data Interpretation, Statistical , Models, Statistical , RNA/chemistry , Sequence Analysis, RNA/methods , Base Sequence , Computer Simulation , Molecular Sequence Data , RNA/genetics
20.
Stat Methods Med Res ; 21(5): 479-507, 2012 Oct.
Article in English | MEDLINE | ID: mdl-22544855

ABSTRACT

This article presents a methodology for modeling aggregated disease incidence data with the spatially continuous log-Gaussian Cox process. Statistical models for spatially aggregated disease incidence data usually assign the same relative risk to all individuals in the same reporting region (census areas or postal regions). A further assumption that the relative risks in two regions are independent given their neighbors' risks (the Markov assumption) makes the commonly used Besag-York-Mollié model computationally simple. The continuous model proposed here uses a data augmentation step to sample from the posterior distribution of the exact locations at each step of a Markov chain Monte Carlo algorithm, and models the exact locations with a log-Gaussian Cox process. A simulation study shows the log-Gaussian Cox process model consistently outperforming the Besag-York-Mollié model. The method is illustrated by making inference on the spatial distribution of syphilis risk in North Carolina. The effects of several known social risk factors are estimated, and areas with risk well in excess of that expected given these risk factors are identified.


Subject(s)
Models, Statistical , Humans , Incidence , North Carolina/epidemiology , Risk Factors , Syphilis/epidemiology